INTRODUCTION


The  evolution  of  human  cells  into  malignant  derivatives  is  driven  by  the  aberrant  function 
of  genes  that  positively  and  negatively  regulate  various  aspects  of  the  cancer  phenotype, 
including  altered  responses  to  mitogenic  and  cytostatic  signals,  resistance  to  programmed 
cell  death,  immortalization,  neoangiogenesis,  and  invasion  and  metastasis(1).  The  integrity 
of  these  gene  functions  is  compromised  by  substantial  genetic  and  epigenetic  alterations 
observed  in  most  cancer  cell  genomes.  To  understand  the  tumorigenic  process,  it  is 
imperative  to  identify  and  characterize  the  genes  that  provide  tumor  cells  with  the 
capabilities  requisite  for  their  initiation  and  progression.  However,  the  identities  of  those 
genes  that  contribute  to  the  tumor  phenotype  are  often  concealed  by  the  frequent 
alterations  in  genes  that  play  no  role  in  tumorigenesis. 

Identifying  genes  that  restrain  tumorigenesis  (tumor  suppressors)  has  proven  especially 
challenging due to their recessive nature. Further complicating their discovery are the
multifaceted mechanisms by which tumor suppressor genes are inactivated, including
changes in copy number and structure, point mutations, and epigenetic alterations.
Moreover,  the  mechanisms  by  which  tumor  suppressor  genes  are  inhibited  may  vary 
between  tumors.  With  this  in  mind,  a  variety  of  molecular  and  cytogenetic  technologies 
have  been  used  to  establish  extensive  catalogs  of  genetic  alterations  within  human 
cancers(3,4). Although it is generally accepted that highly recurrent aberrations signify
changes that are important for tumor development, the causal perturbations underlying
tumorigenesis are often obscured by the large size of the alterations and the large
number that are incidental to the tumor phenotype. As such, new strategies to delineate
genes  with  functional  relevance  to  tumor  initiation  and  development  are  essential  to 
understanding  these  processes. 

One  approach  to  this  problem  involves  the  use  of  in  vitro  models  of  human  cell 
transformation.  In  such  models,  primary  cells  are  transformed  into  tumorigenic 
derivatives by the coexpression of cooperating oncogenes. These experimental
models  have  been  useful  in  delineating  the  minimum  genetic  perturbations  required  for 
transformation  of  various  human  cell  types  as  well  as  evaluating  the  functional 
cooperation  between  a  gene  of  interest  and  a  defined  genetic  context.  To  date,  these 
models  of  human  cell  transformation  have  incorporated  genes  already  implicated  in 
human  tumorigenesis.  However,  such  models  also  provide  a  potentially  useful  platform 
for  the  identification  of  new  pathways  that  contribute  to  the  transformed  phenotype. 

In this award, we originally proposed two basic areas of investigation. The first area was
the development of methods to investigate the repertoire of the immune system to
determine whether auto-antibodies exist that might predict the onset of breast cancer.
The second area was the construction and use of shRNA libraries to find genes relevant
to breast cancer and, ideally, targets whose inhibition might kill tumor cells.

A  key  part  of  our  research  plan  has  been  the  development  and  use  of  retroviral  vectors 
expressing  RNA  interference  RNAs  to  identify  human  genes  involved  in  causing  or 
restraining  cancer.  In  our  first  progress  reports  we  described  our  efforts  to  develop 
shRNA libraries and showed they could be used to identify tumor suppressors. Ultimately,
our goal is to screen complex pools of shRNA-expressing retroviruses, each marked
with  a  bar  code  that  allows  the  results  of  the  screen  to  be  read  out  by  microarray 
hybridization.  We  demonstrated  this  could  be  accomplished  in  enrichment  screens  for 
shRNAs  that  caused  cellular  transformation  and  growth  in  soft  agar  and  identified  the 
REST  gene  and  several  other  tumor  suppressors.  However,  a  key  goal  has  been  to 
identify  shRNAs  that  debilitate  or  kill  cancer  cells.  In  order  for  this  to  be  possible  in 
complex  pools,  it  is  imperative  that  each  vector  knock  down  its  target  with  high 
penetrance. We have achieved this level of knockdown and can now see
particular shRNA-expressing viruses drop out of complex pools. We have used this
methodology to search for genes whose knockdown enhances the proliferative capacity of
normal breast epithelial cells (i.e., candidate tumor suppressors) and for genes whose
knockdown is lethal specifically to cancer cells.

We  have  also  searched  for  genes  that  when  overproduced  are  capable  of  transforming 
human  mammary  epithelial  cells.  In  this  way  we  have  found  several  proteins  with 
oncogenic  potential,  several  of  which  are  amplified  in  breast  cancers.  We  have 
performed  a  biological  analysis  of  one  of  these  in  detail,  PVRL4. 

With respect to our goal of deconvoluting the auto-antibody response in patients with
breast cancer, we have developed a representation of all human linear peptides and a
methodology to screen them with breast cancer patient sera to look for autoantibodies
that might provide a biomarker for early detection of breast cancer.

BODY 

Identification  of  cancer-relevant  genes. 

Development  and  exploitation  of  shRNA  libraries  to  identify  cancer  relevant  genes 
using  human  genetic  screening. 

Retroviral  shRNA-mediated  genetic  screens  in  mammalian  cells  are  powerful  tools  for 
discovering  loss-of-function  phenotypes.  We  have  been  working  on  the  generation  of 
shRNA libraries for the express purpose of performing screens to kill cancer cells. During
this  grant  we  developed  a  highly  parallel  multiplex  methodology  for  screening  large 
pools  of  shRNAs  using  half-hairpin  barcodes  for  microarray  deconvolution. 

We generated barcoded, microRNA-based shRNA libraries
targeting the entire human genome that can be expressed efficiently from retroviral or
lentiviral vectors in a variety of cell types for stable gene knockdown(8,9). These
constructs  include  silencing  triggers  designed  to  mimic  a  natural  microRNA  primary 
transcript,  and  each  target  sequence  was  selected  on  the  basis  of  thermodynamic  criteria 
for  optimal  small  RNA  performance.  Biochemical  and  phenotypic  assays  indicated  that 
the  new  libraries  are  substantially  improved  over  our  first  generation  reagents.  We 
generated a sequence-verified library comprising more than 140,000 second-generation
short  hairpin  RNA  expression  plasmids  covering  a  substantial  fraction  of  all  predicted 
genes  in  the  human  genome.  This  work  is  described  in  Ref  9  and  the  details  are  in  the 
published  paper,  which  is  included  in  the  appendix. 

Short hairpin RNAs (shRNAs) expressed from polymerase III promoters can be
encoded in transgenes and used to produce small interfering RNAs that down-regulate
specific genes, and this was the expression context of our version 2 shRNA libraries.
However, we found that polymerase II-transcribed shRNAs display very efficient
knockdown of gene expression when the shRNA is embedded in a microRNA context.
Importantly,  our  shRNA  expression  system  [called  PRIME  (potent  RNA  interference 
using  microRNA  expression)  vectors]  allows  for  the  multicistronic  cotranscription  of  a 
reporter  gene,  thereby  facilitating  the  tracking  of  shRNA  production  in  individual 
cells(31). Based on this system, we developed a series of lentiviral vectors that display
tetracycline-responsive  knockdown  of  gene  expression  at  single  copy.  The  high 
penetrance  of  these  vectors  will  facilitate  genomewide  loss-of-function  screens  and  is  an 
important  step  toward  using  bar-coding  strategies  to  follow  loss  of  specific  sequences  in 
complex  populations. 

We  also  developed  a  method  of  screening  complex  pools  of  shRNAs  using  barcodes 
coupled  with  microarray  deconvolution  to  take  advantage  of  the  highly  parallel  format, 
low cost, and flexibility in assay design of this approach(9,10). Although barcodes are not
essential for enrichment screens (positive selection)(10-12), they are critical for dropout
screens (negative selection) such as those designed to identify cell-lethal or drug-sensitive
shRNAs, which we proposed to do for this Innovator award. Hairpins that are
depleted  over  time  can  be  identified  through  the  competitive  hybridization  of  barcodes 
derived  from  the  shRNA  population  before  and  after  selection  to  a  microarray. 
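
The before-and-after comparison described above can be sketched in a few lines. This is a minimal illustration, not the analysis pipeline used in the published screens: the function names, the pseudocount, the barcode identifiers, and the 2-fold cutoff used here are all assumptions introduced for the example.

```python
import math

def log2_ratios(initial, final, pseudocount=1.0):
    """Return log2(final / initial) per barcode; negative values mean depletion.

    `initial` and `final` map barcode IDs to background-corrected signal
    intensities. The pseudocount (an assumption) avoids division by zero.
    """
    return {
        barcode: math.log2((final[barcode] + pseudocount) /
                           (initial[barcode] + pseudocount))
        for barcode in initial
    }

def depleted(ratios, min_fold=2.0):
    """List barcodes whose signal dropped at least `min_fold`-fold."""
    cutoff = -math.log2(min_fold)
    return sorted(b for b, r in ratios.items() if r <= cutoff)

# Hypothetical intensities for three barcodes before and after selection.
initial = {"shRNA_A": 1000.0, "shRNA_B": 1000.0, "shRNA_C": 1000.0}
final = {"shRNA_A": 1000.0, "shRNA_B": 200.0, "shRNA_C": 2100.0}

ratios = log2_ratios(initial, final)
print(depleted(ratios))
```

In a real two-color experiment the two channels would also need dye-bias normalization and replicate handling, which this sketch leaves out.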

We initially described the use of 60-mer barcodes for pool deconvolution(9,10). To provide
an alternative to these barcodes that enables more rapid construction and screening of
shRNA libraries, we developed a methodology called half-hairpin (HH) barcoding for
deconvoluting pooled shRNAs(14). We took advantage of the large 19-nt hairpin loop of
our mir30-based platform and designed a PCR strategy that amplifies only the 3'-half of
the shRNA stem. Compared to using full hairpin sequences for microarray
hybridization(15,16), HH barcodes entirely eliminate probe self-annealing, providing the
dynamic range necessary for pool-based dropout screens. HH barcode signals are highly
reproducible in replicate PCRs (R = 0.973), highly specific (0.5% cross reaction), and
display  reasonable  dynamic  range  in  mixing  experiments  where  sub-pool  inputs  are 
varied  in  competing  hybridization  experiments.  Taken  together,  these  results  indicate  HH 
barcodes  are  alternatives  to  the  60-mer  barcodes  originally  designed  into  our  library. 

We have also made improvements over both the 60-mer barcodes and the half-hairpin
barcodes. Both barcode types suffer from cross-hybridization and, occasionally, from
poor hybridization capacity for unknown reasons. We developed a framework for
designing large sets of orthogonal barcode probes. We demonstrated the utility of this
framework by designing 240,000 barcode probes and testing their performance by
hybridization. From the test hybridizations, we also discovered new probe design rules
that significantly reduce cross-hybridization once they are incorporated into
the algorithm. These rules should improve the performance of DNA microarray probe
designs for many applications. The details of this work are in the published paper Xu et
al, 2009 (Ref 17), which is included in the appendix. We have recently used some of
these barcodes, and they appear to have much better properties than the half-hairpin
barcodes.
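
To make the idea of an orthogonal probe set concrete, the sketch below greedily keeps candidate sequences that differ from every previously kept sequence at a minimum number of positions. This is only a toy stand-in and is not the framework of Xu et al (Ref 17): Hamming distance is a crude proxy for cross-hybridization, and the function names, candidate sequences, and distance threshold are assumptions made for illustration.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def pick_orthogonal(candidates, min_distance):
    """Greedily keep sequences at least `min_distance` mismatches apart."""
    chosen = []
    for seq in candidates:
        if all(hamming(seq, kept) >= min_distance for kept in chosen):
            chosen.append(seq)
    return chosen

# Hypothetical 8-nt candidates; real barcode probes are much longer.
candidates = ["ACGTACGT", "ACGTACGA", "TGCATGCA", "TGCATGCT", "GGGGCCCC"]
selected = pick_orthogonal(candidates, min_distance=4)
print(selected)
```

A realistic design would additionally screen for melting temperature, secondary structure, and shifted or reverse-complement matches, none of which are modeled here.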

Using  these  libraries  and  screening  methods,  we  carried  out  dropout  screens  for  shRNAs 
that affect cell proliferation and viability in cancer and normal cells. We identified many
anti-proliferative shRNAs that target core cellular processes, such as the cell cycle
and protein translation, in all cells examined. More importantly, we identified genes that
are  selectively  required  for  proliferation  and  survival  in  different  cell  lines.  Our  platform 
enables  rapid  and  cost-effective  genome-wide  screens  to  identify  cancer  proliferation  and 
survival  genes  for  target  discovery.  Such  efforts  are  complementary  to  the  Cancer 
Genome  Atlas  and  provide  an  alternative  functional  view  of  cancer  cells.  This  initial 
screening technology resulted in two papers in Science, Schlabach et al, 2008 (Ref 18)
and Silva et al, 2008 (Ref 19). I will not go into great detail here with respect to the
data because both manuscripts are included in the appendix of this report and
describe the relevant findings. The bottom line is that these methods work, they are robust,
and we can use them to find cancer-relevant genes. I also describe below the
identification of additional genes relevant to breast cancer cell survival using this
technology.

Control  of  REST  Degradation 

Below I will describe the studies on the REST tumor suppressor protein. Actual data will
not be included as they are all published and included as full papers(20-22).

The transcription factor REST/NRSF (RE1-Silencing Transcription Factor) is a master
repressor of neuronal gene expression and neuronal programs in non-neuronal lineages(23-25).
In the course of this grant, we identified REST as a human tumor suppressor in breast
epithelial tissues(20), suggesting that REST regulation may have important physiologic
and pathologic consequences. We showed that loss of REST function, either through
knockdown with shRNA libraries or through expression of a dominant negative REST mutant
found in a human tumor, could cause human mammary epithelial cells to acquire
tumorigenic properties such as growth in soft agar. Many pathways controlling REST have
yet to be elucidated. We went on to study this problem and found that REST is regulated by
ubiquitin-mediated proteolysis(21). We found through an RNAi screen that SCFβTRCP is an
E3 ubiquitin ligase responsible for REST degradation. βTRCP binds and ubiquitinates
REST and controls its stability through a conserved phosphodegron. During neural
differentiation and in breast cells, REST is degraded in a βTRCP-dependent manner.
βTRCP is required for proper neural differentiation only in the presence of REST,
indicating that βTRCP facilitates this process through degradation of REST. Conversely,
failure to degrade REST attenuates differentiation. Furthermore, we find that βTRCP
overexpression, which is common in human epithelial cancers including breast, causes
oncogenic transformation of human mammary epithelial cells and that this pathogenic
function requires REST degradation. Thus, REST is a key target in βTRCP-driven
transformation, and the βTRCP-REST axis is a new regulatory pathway controlling
neurogenesis.

The data we generated demonstrate that REST is a labile protein targeted for
ubiquitin-dependent proteasomal degradation by SCFβTRCP through a phosphodegron on REST.
We showed that SCFβTRCP is a critical regulator of both physiologic and pathologic REST
activities, constituting a new pathway controlling neural differentiation and cellular
transformation. We provided the first genetic evidence that REST and SCFβTRCP regulate
an early stage in neural specification as an inhibitor of neural differentiation. This is
likely to be important in epithelial cancers as forced SCFβTRCP activation is known to
cause breast cancer when expressed in mouse mammary glands. Our data are consistent
with  a  model  in  which  developmental  cues  induce  degradation  of  REST,  resulting  in  the 
derepression  of  REST  targets.  The  ability  of  stable  REST  to  inhibit  terminal 
differentiation  of  neurons  also  predicts  that  REST  may  promote  proliferative  properties  in 
the  neuronal  lineage  when  overproduced  or  inappropriately  stabilized.  Consistent  with 
this  notion,  REST  is  overexpressed  in  human  medulloblastoma  and  ectopic  REST 
expression  in  v-myc-immortalized  neural  stem  cells  promotes  medulloblastoma  formation 
in mice. Thus, the contrasting roles of REST as an oncogene and tumor suppressor
are  highly  dependent  on  the  developmental  lineages. 

βTRCP is overexpressed and oncogenic in epithelial cancers(28-30), and we identified
REST as a key target in this context. This suggests that pharmacologic inhibition of
βTRCP may provide a means to restore REST tumor suppressor function in human
cancer. The presence of a phosphodegron motif within REST suggests a role for
upstream kinase(s) and/or phosphatase(s) that control REST degradation. We propose a
model in which differentiation into the neural state is induced by this yet-to-be-discovered
signal transduction cascade that targets REST for degradation by SCFβTRCP, acting
cooperatively with induction of βTRCP expression during neural differentiation.

Conversely,  hyperactivation  of  such  pathway(s)  priming  REST  degradation  may  be 
oncogenic  in  epithelial  tissues  and  thus  serve  as  new  therapeutic  targets  in  cancers  with 
compromised  REST  function.  Thus,  exploration  of  these  pathways  will  likely  provide 
new  opportunities  for  modulating  neural  stem  cell  and  cancer  cell  behavior.  We  need  to 
find  the  kinase  that  regulates  REST  as  it  will  be  a  potential  oncogene. 

We also participated in a collaboration with Yang Shi's laboratory, which was working on
methyltransferases. They found a protein, CDYL, that binds both REST and
methyltransferases to negatively regulate target gene expression through REST binding
to promoters. Together we showed that CDYL is also a tumor suppressor candidate and that
knocking down CDYL expression transforms HMECs(22).

The data obtained from these studies comprise three papers that are included in the
addendum of papers(20-22).

Identification  of  novel  tumor  suppressor  genes 

As  we  have  described  previously,  we  have  developed  viral  shRNA  libraries  targeting  the 
entire  human  genome  to  explore  loss-of-function  phenotypes  in  mammalian  systems  and 
have  applied  these  libraries  towards  identification  of  novel  tumor  suppressor  genes.  With 
recent improvements in our library, we have revisited our search for novel tumor
suppressor genes in breast cancer by using a genome-wide enrichment screen to identify
genes whose knockdown increases proliferation and/or survival of normal human
mammary epithelial cells (HMECs).

The design of our HMEC enrichment screen is illustrated in Figure 1. Early-passage
HMECs used for the screen were obtained from a reduction mammoplasty and
immortalized by expression of hTERT and spontaneous silencing of p16. To screen the
entire genome-wide library of ~80,000 shRNAs, screens were performed in 6 separate
pools of ~13,000 shRNAs. HMECs were infected in triplicate with a representation of
1000 cells per shRNA at an MOI of 2 viruses per cell. Initial reference samples were
collected 72 hours post-infection. The remaining cells were puromycin-selected for 4
days and propagated with a representation of >1000 cells per shRNA maintained at each
passage. Cells were collected as the end samples after ~7 population doublings (PDs).
Probes  were  prepared  from  both  samples,  and  Cy3-  or  Cy5-labeled  probes  were 
competitively  hybridized  to  half-hairpin  barcode  microarrays  to  measure  the  change  in 
representation  of  each  shRNA  over  time. 
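
As a rough illustration of the scale implied by these parameters, the sketch below estimates how many cells must be exposed to virus for one pool. It assumes Poisson-distributed infection and assumes that the stated representation counts infected cells; neither assumption is stated in this report, and the function names are hypothetical.

```python
import math

def fraction_infected(moi):
    """Poisson estimate of the fraction of cells receiving at least one virus."""
    return 1.0 - math.exp(-moi)

def cells_to_plate(n_shrnas, cells_per_shrna, moi):
    """Cells to expose to virus so that infected cells reach the target representation."""
    infected_needed = n_shrnas * cells_per_shrna
    return math.ceil(infected_needed / fraction_infected(moi))

# Parameters taken from the screen description: ~13,000 shRNAs per pool,
# 1000 cells per shRNA, MOI of 2.
per_pool = cells_to_plate(n_shrnas=13_000, cells_per_shrna=1000, moi=2.0)
print(f"fraction infected at MOI 2: {fraction_infected(2.0):.3f}")
print(f"cells per pool per replicate: {per_pool:,}")
```

Under these assumptions each pool needs on the order of 1.5 x 10^7 cells per replicate, which is why the library is split into 6 pools rather than screened as a single population.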


Figure 1. Tumor suppressor screen design. A genome-wide enrichment screen was
performed in human mammary epithelial cells for shRNAs against genes whose knockdown
increases proliferation or cell survival. Such genes are novel tumor suppressor candidates.
[Workflow shown: shRNA oligonucleotides printed on microarray; PCR cloning into plasmid
library; packaging into retrovirus pool; infection of HMECs in triplicate (80,000 shRNAs,
6 pools); initial samples (Cy5 labeled) and end samples after 6-8 passages (Cy3 labeled);
PCR recovery of half-hairpin amplicons from genomic DNA; competitive hybridization to
microarray; enrichment.]

Statistical analysis indicated that most probes consistently yielded signals >2-fold
above the mean background of negative control probes across all triplicates. The
correlations among the initial samples across triplicates and between the initial and end
samples within each replicate were high, indicating high reproducibility and maintenance
of representation. As expected, most shRNAs showed little change over the time course of
the screen. However, using Significance Analysis of Microarrays (SAM) with a false
discovery rate (FDR) of 5%, we observed that the abundance of 4257 shRNAs against 3653
genes was increased >2-fold over the course of the screen, indicating that these shRNAs
increase proliferation and/or survival of HMECs. Many of these genes show statistically
significant overlap with deletion regions in breast cancers.


Many of the genes whose knockdown leads to increased proliferation and/or survival are
known tumor suppressor genes or key mediators of proliferation and apoptotic pathways.
For example, shRNAs against genes encoding the tumor suppressors Rb, PTEN, and p53
were all increased over the course of the screen. Furthermore, shRNAs against
proliferation genes, such as those encoding the Rb-like proteins p107 and p130 and the
cyclin-dependent kinase inhibitors p21, p27, and p57, as well as shRNAs against
pro-apoptotic genes, such as those encoding caspase 3, caspase 6, and Apaf-1, scored as well.

We  are  currently  using  a  multi-color  competition  assay  (MCA)  to  validate  shRNAs  which 
have  scored  in  our  screen  and  determine  whether  knockdown  of  candidate  genes 
increases  proliferation  and/or  survival.  Thus  far,  we  have  examined  142  candidate 
shRNAs  against  123  known  genes  that  scored  in  our  screen  for  their  effect  on  cell 
proliferation and/or survival. In the MCA, 58 of 142 candidate shRNAs (41%)
were validated as increasing proliferation and/or survival compared to FF2 control shRNAs
over the 6-day assay. We are currently investigating these genes to determine the mechanisms
by  which  their  knockdown  increases  proliferation  and/or  survival  and  whether  they  are 
novel  tumor  suppressors.  Furthermore,  we  are  using  genomic  profiling  of  tumor  samples 
to  determine  whether  these  genes  are  located  in  focal  deletion  regions,  which  would 
suggest  that  they  are  involved  in  tumor  suppression. 

We have also synthesized a new library of shRNAs that has a much deeper representation
of shRNAs for our candidate genes (12 hairpins per gene) and have screened that library to
look for sequences enriched after growth in HMECs. This will allow us to determine
whether the gene of interest is indeed the target of the shRNAs we have
identified. If multiple independent shRNAs against a single gene are enriched, the gene is
likely a true target rather than an off-target effect.
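
The multiple-hairpin criterion can be expressed as a simple grouping step. The sketch below is an illustration only: the function name, the shRNA and gene identifiers, and the default requirement of two concordant hairpins are assumptions, not thresholds stated in this report.

```python
from collections import defaultdict

def genes_with_multiple_hits(enriched_shrnas, shrna_to_gene, min_hairpins=2):
    """Group enriched shRNAs by target gene and keep genes hit by several hairpins."""
    hits = defaultdict(set)
    for shrna in enriched_shrnas:
        hits[shrna_to_gene[shrna]].add(shrna)
    return {gene: sorted(members)
            for gene, members in hits.items()
            if len(members) >= min_hairpins}

# Hypothetical library annotation and screen result.
shrna_to_gene = {
    "sh1": "GENE_A", "sh2": "GENE_A", "sh4": "GENE_A",
    "sh3": "GENE_B", "sh5": "GENE_B",
}
enriched = ["sh1", "sh2", "sh3", "sh4"]

supported = genes_with_multiple_hits(enriched, shrna_to_gene)
print(supported)
```

Here GENE_A is supported by three independent hairpins, whereas GENE_B is supported by only one and would remain a possible off-target result.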


Identification  of  breast  cancer-specific  lethal  genes 

Genetic  loss-of-function  screening  to  identify  genes  that  are  essential  to  cancer  cell 
proliferation  and  survival  is  a  powerful  and  complementary  approach  to  large  sequencing 
efforts  and  is  expected  to  provide  many  potential  cancer  drug  targets.  Towards  this  end, 
we  have  performed  lethality  screens  to  identify  genes  that  are  selectively  required  for 
proliferation  and  survival  of  breast  cancer  cells  but  not  normal  mammary  epithelial  cells, 
which we call Breast Cancer Lethal (BCAL) genes. For lethality screens, the abundance
of  shRNAs  targeting  genes  that  are  essential  for  cell  viability  will  be  reduced  following 
cell  passaging  and  will  thus  “drop-out”  of  the  shRNA  population.  By  comparing  each 
shRNA’s  abundance  in  an  initial  cell  population  taken  shortly  after  retroviral  shRNA 
library  infection  to  its  abundance  in  samples  taken  after  several  cell  population  doublings, 
lethal  shRNAs  can  be  identified.  Additionally,  comparisons  between  the  shRNA  lethality 
profiles  of  breast  cancer  cells  and  normal  human  mammary  epithelial  cells  can  identify 
BCAL genes (Figure 2).

Figure 2. Breast cancer lethal screen design. A genome-wide lethality screen was
performed in three breast cancer cell lines and one normal human mammary epithelial
cell line for shRNAs against genes whose knockdown decreases viability of breast cancer
but not normal cells. Such genes are novel cancer drug targets.
[Panel title: Genome-wide Breast Cancer Lethal Screens. Workflow shown: retroviral shRNA
library (80,000 shRNAs, 6 pools); infection of HCC1143, HCC1954, T-47D, and HMEC cells;
Cy5-labeled initial samples and Cy3-labeled end samples after 6-8 PDs; lethal dropouts
(red spots); comparison of lethal hits among cell lines to determine breast
cancer-specific and subtype-specific targets.]

To identify BCAL genes, we performed highly parallel, genome-wide, pooled
shRNA lethality screens in three breast cancer cell lines (HCC1954, HCC1143, T47D) and
one normal mammary epithelial cell line (HMEC). The entire genome-wide library of
~80,000 shRNAs was screened in 6 separate pools of ~13,000 shRNAs/pool. Cells were
infected in triplicate with a representation of 1000 cells per
shRNA at an MOI of 2 viruses per cell. Initial reference samples were collected 72 hours
post-infection. The remaining cells were puromycin-selected for 4 days and propagated
with a representation of >1000 cells per shRNA maintained at each passage. Cells were
collected as the end samples after ~7 population doublings (PDs). Probes were prepared
from both initial and end samples, and Cy3- or Cy5-labeled probes were competitively
hybridized to half-hairpin barcode microarrays to measure the change in representation of
each shRNA over time.


Statistical analysis indicated that most probes consistently yielded signals >2-fold above
the mean background of negative control probes across all triplicates. The correlations
among the initial samples across triplicates and between the initial and end samples
within each replicate were high, indicating high reproducibility and maintenance of
representation. To identify cancer-specific lethal shRNAs, we used Significance Analysis
of Microarrays (SAM) with a false discovery rate (FDR) of 10% as well as several
fold-change cutoffs. For a given shRNA to score as a BCAL shRNA, its abundance had to
decrease >2-fold in at least one of the three breast cancer cell lines but not decrease
>2-fold in HMECs. Furthermore, a BCAL shRNA had to display a >1.8-fold difference in
abundance between the normal and breast cancer cell line. We identified 3787 shRNAs
against 3410 genes that met these BCAL criteria. These shRNAs and genes were further
classified into 4 groups: "pan" BCAL shRNAs that were selectively lethal to 2 or more
breast cancer cell lines, and HCC1954, HCC1143, and T47D cell type-specific BCAL shRNAs
that were selectively lethal to a single breast cancer cell line. We are currently validating
whether  knockdown  of  candidate  BCAL  genes  leads  to  reduced  viability  of  breast  cancer 
cell lines but not normal HMECs using a CellTiter-Glo viability assay. Furthermore, we
are  investigating  these  candidates  using  mechanistic  studies  to  determine  how  their 
reduction  leads  to  cancer  cell  lethality.  We  expect  these  BCAL  genes  will  reveal  novel 
oncogene  or  non-oncogene  addictions  of  breast  cancer  cells  that  can  lead  to  new 
therapeutic  targets  for  breast  cancer. 
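
The fold-change portion of the BCAL scoring rules can be sketched as follows. This is a minimal illustration under stated assumptions: abundances are expressed as log2(end/initial) ratios, the 1.8-fold criterion is interpreted as the gap between the HMEC ratio and the cancer-line ratio, and the SAM FDR filter is omitted entirely. The function and variable names are hypothetical.

```python
import math

FOLD_DROP = math.log2(2.0)   # >2-fold decrease in abundance
FOLD_GAP = math.log2(1.8)    # >1.8-fold difference between normal and cancer cells

def selectively_lethal(cancer_lfc, hmec_lfc):
    """Apply the three fold-change rules to one cancer line versus HMECs."""
    return (cancer_lfc < -FOLD_DROP
            and hmec_lfc >= -FOLD_DROP
            and (hmec_lfc - cancer_lfc) > FOLD_GAP)

def classify(lfc_by_line, normal="HMEC"):
    """Return 'pan', a single cell line name, or None for one shRNA."""
    hmec = lfc_by_line[normal]
    lines = sorted(line for line, lfc in lfc_by_line.items()
                   if line != normal and selectively_lethal(lfc, hmec))
    if not lines:
        return None
    return "pan" if len(lines) >= 2 else lines[0]

# Hypothetical log2(end/initial) values for three shRNAs.
pan_example = {"HMEC": -0.1, "HCC1954": -1.5, "HCC1143": -1.4, "T47D": 0.0}
single_example = {"HMEC": -0.2, "HCC1954": -0.3, "HCC1143": -0.1, "T47D": -2.0}
general_lethal = {"HMEC": -1.5, "HCC1954": -2.5, "HCC1143": -2.5, "T47D": -2.5}

print(classify(pan_example), classify(single_example), classify(general_lethal))
```

The third example drops out of HMECs as well, so it is treated as generally lethal rather than cancer-specific and receives no BCAL label.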


Identification  of  synthetic  lethal  genes  with  K-ras 

A  major  challenge  in  cancer  therapeutics  is  the  identification  of  cellular  drug  targets 
whose  inhibition  leads  to  the  selective  killing  of  cancer  cells  while  sparing  normal  cells. 
Recent  advances  in  mammalian  RNA  interference  (RNAi)  technologies  have  made  it 
possible to systematically interrogate the human genome for genes whose loss of function
constitutes synthetic lethality either with the oncogenic state or with particular oncogenic
mutations(13,18,19). We have developed barcoded, retroviral/lentiviral-based short hairpin
RNA (shRNA) libraries targeting the entire human genome to enable genome-wide
loss-of-function analysis through stable gene knockdown(9). Our design also allowed us to
develop a multiplex screening platform that enables the highly parallel screening of
>10,000 shRNAs in a pool-based format using microarray deconvolution(18,19). These
technological  breakthroughs  have  made  it  possible  to  rapidly  interrogate  the  genome  for 
functional  vulnerability  of  cancer  cells  and  here  we  apply  these  to  the  Ras  oncogene. 

The Ras family of small GTPases is frequently mutated in human cancers (reviewed in
Ref 34). Ras is a membrane-bound signaling molecule that cycles between the inactive,
GDP-bound state and the active, GTP-bound state. Growth factor receptor signaling
promotes GTP loading and activation of Ras, which in turn activates an array of
downstream pathways to promote cell proliferation and survival. Among the major Ras
effectors are the MAP kinase pathway, the PI3-kinase (PI3K) pathway, RalGDS
proteins, phospholipase Cε, and Rac. Each of these has been implicated in mediating
Ras oncogenesis. Ras GAPs (GTPase activating proteins) inactivate Ras by stimulating
GTP hydrolysis. Oncogenic mutations in Ras are invariably point mutations that either
interfere with Ras GAP binding or directly disrupt Ras GTPase activity, locking Ras in a
constitutively active, GTP-bound state. Oncogenic mutations have been found in all three
members of the Ras gene family, with KRAS being the most frequently mutated. KRAS
mutations are found at high frequencies in pancreatic, thyroid, colon, lung and liver
cancers and in myelodysplastic syndrome and are correlated with poor prognosis(34).

Despite the prominent status of Ras as a cancer drug target, developing therapeutics that
disrupt the Ras pathway has proven challenging thus far. Inhibitors of farnesyl transferase,
the enzyme that prenylates Ras for its membrane localization, have met with only limited
success(34). Chemical screens in isogenic Ras mutant and wild-type cell lines have
identified compounds that exhibit preferential toxicity towards Ras mutant cells(35,36).
However, the translation of these chemical screens into clinical practice has been
impeded by the challenge of identifying the protein targets of these chemical entities and
of subsequent drug development. Inhibitors targeting various Ras effector pathways could
also prove efficacious in treating tumors with Ras mutations, as it was recently shown
that a combined application of MEK and PI3K/mTOR inhibitors can reduce tumor
burden in a mouse model of Ras-driven lung cancer. However, the prevalence of de
novo and acquired drug resistance to other targeted therapies suggests that combinations
of multiple therapeutic agents will be required to effectively inhibit malignant
progression.

In principle, tumors can be attacked either by reversing the effects of oncoproteins
through inhibition (i.e. exploiting oncogene addiction), or by attacking tumor-specific
vulnerabilities caused by the oncogenic state, often by inhibiting proteins that are not
oncoproteins themselves (i.e. exploiting non-oncogene addiction)(32,33). In theory, the
inappropriate rewiring of cellular signaling through oncogene activation should result in
vulnerabilities that could be exploited for cancer therapies. Since these vulnerabilities are not
obvious and cannot be predicted, the most direct approach to their discovery is through
genetic exploration. The systematic identification of genes and pathways necessary for
the Ras-driven oncogenic state would provide additional drug targets for therapeutic
exploration, shed new light on Ras’ mechanisms of action and potentially provide new
biomarkers for patient stratification. To this end, we undertook a genome-wide RNAi
screen  to  identify  synthetic  lethal  interactions  with  the  KRAS  oncogene.  We  discovered  a 
diverse  set  of  proteins  whose  depletion  selectively  impaired  the  viability  of  Ras  mutant 
cells.  Among  these  we  observed  a  strong  enrichment  for  genes  with  mitotic  functions. 
We  found  a  pathway  involving  the  mitotic  kinase  PLK1,  the  anaphase  promoting 
complex/cyclosome  and  the  proteasome  that,  when  inhibited,  results  in  prometaphase 
accumulation  and  the  subsequent  death  of  Ras  mutant  cells.  Gene  expression  analysis 
indicates  that  reduced  expression  of  genes  in  this  pathway  correlates  with  increased 
survival  of  patients  bearing  tumors  with  a  Ras  transcriptional  signature.  Our  results 
suggest  a  previously  underappreciated  role  for  Ras  in  mitotic  progression  and 
demonstrate  a  pharmacologically  tractable  pathway  for  the  potential  treatment  of  cancers 
harboring  Ras  mutations. 

We have found the synthetic lethal approach to be very informative for
identifying potential targets for anti-cancer therapeutics that can be used in combination
with other drugs to attack a cancer of a particular genotype with particular oncogenic or
tumor suppressor mutations. We hope to continue these sorts of screens in the future to
identify genes whose depletion is toxic in different cancers. The experimental details of
this work are included in the Luo et al. Cell paper in the appendix.


Anti-tumor  Antibody  profiling 

Anti-tumor auto-antibodies have been proposed to be highly sensitive and specific
biomarkers for early cancer detection. To identify antibody binding profiles specific for
cancer patients, a number of groups have utilized phage display, a high-throughput
affinity assay. In this approach, libraries of peptides or protein fragments are displayed on
the surface of a bacteriophage, thereby allowing a proteomic screen for binding
properties of a patient’s antibody repertoire. Previous efforts towards auto-antibody
biomarker discovery using phage display have used either random peptides or
tumor-derived cDNA libraries to “pan” displayed peptides against patient sera. Such studies
have been difficult to interpret, however, since resulting “hits” are frequently out of
frame or derived from noncoding sequences.

Our  novel  strategy  has  been  to  encode  the  human  “peptidome”  as  a  set  of  individual 
DNA  microarray-derived  oligonucleotides.  We  have  generated  a  library  of  approximately 
467,000  oligos  that  tile  the  entire  set  of  human  open  reading  frames.  For  phage  display, 
the  oligos  were  cloned  into  the  T7Select  (Novagen)  system,  allowing  for  low  copy 
number  display  fused  to  the  T7  coat  protein,  10B.  The  library  was  then  extensively 
characterized  by  sequencing  several  hundred  individual  phage  clones  at  random.  As  we 
reported  last  year,  74%  of  our  phage  population  encodes  in-frame  peptides,  and  55%  of 
the  population  expresses  completely  correct  sequences.  This  phage  library  is  unique  in 
several  respects.  First,  this  is  the  first  example  of  a  synthetic  phage  display  library 
encoding  protein  fragments  from  microarray-derived  oligonucleotides.  Second,  this 
library  is  the  only  example  of  a  normalized  representation  of  the  human  peptidome.  The 
alternative random peptide and cDNA-derived libraries are much less powerful from a
cancer biomarker screening standpoint.


Figure 3. Novel approach to patient autoantibody profiling. The T7-Peptidome library is
mixed with patient antibodies and specific complexes are allowed to form. Complexes are
purified on magnetic protein A/G beads. High-throughput sequencing is used to identify
captured phage, and profiles are compared between patients.
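
To make the idea of tiling open reading frames with fixed-length peptides concrete, the sketch below cuts a protein sequence into 33-residue tiles (the peptide length given in the figure text). The step between tiles is not stated in this section, so the `step` value, the function name and the toy sequence are assumptions made purely for illustration. The final tile is anchored at the C-terminus so that the last residues are always covered.

```python
def tile_protein(sequence, peptide_len=33, step=17):
    """Split a protein sequence into overlapping fixed-length peptide tiles.

    peptide_len=33 follows the report; step=17 is a hypothetical choice,
    because the actual overlap used for the library is not given here.
    """
    if peptide_len <= 0 or step <= 0:
        raise ValueError("peptide_len and step must be positive")
    if len(sequence) <= peptide_len:
        return [sequence]
    last_start = len(sequence) - peptide_len
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # extra tile anchored at the C-terminus
    return [sequence[s:s + peptide_len] for s in starts]


# Toy 103-residue sequence (hypothetical, not a real ORF).
toy_orf = "MKT" + "ACDEFGHIKLMNPQRSTVWY" * 5
tiles = tile_protein(toy_orf)
for i, tile in enumerate(tiles, start=1):
    print(i, tile)
```

Each 33-residue tile corresponds to 99 coding nucleotides, which is why such a library can be encoded as a set of individually synthesized oligonucleotides.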


Autoantibody Profiling Strategy

T7 phage: library of 33 amino acid peptides that tile the human ORFeome (4.67 x 10^5 members)

Incubate T7 ORF library with patient serum containing autoantibodies

Purify phage-antibody complexes on protein A/G magnetic beads

Identity of captured phage population determined using Illumina's high-throughput
sequencing technology.

Compare sequences between cancer patients and controls.

The sensitivity of our screen for auto-antibodies associated with breast cancer is
determined by the degree of specific enrichment that we are able to obtain by
immunoprecipitating antibody-bound phage. To this end, we have optimized the
conditions for enrichment since our last report. By diluting a FLAG-tagged T7 phage
(1:1000) into a native phage population and diluting an anti-FLAG antibody (1:1000) into
a non-specific isotype control antibody, we were able to mimic the high complexity of
our assay with just four variables. We optimized for enrichment (measured quantitatively
with the plaque lift assay) by systematically varying the following parameters: T7 phage
concentration, antibody concentration, time of immunoprecipitation, and number of
washes before phage elution. The concentrations of phage and antibody during complex
formation were assumed to be dependent on each other, and were thus optimized
simultaneously (Figure 4). We successfully optimized all 4 parameters, allowing for a
FLAG phage enrichment factor of over 4000-fold. This shows that the basic premise of the
screen is sound and that we should be able to screen breast cancer serum samples.

Figure 4. Optimization of enrichment with respect to phage and serum antibody
concentration. Total phage and serum antibody were mixed at the concentrations indicated
and left to form complexes overnight at 4°C. Enrichment was determined by plaque lift
assay for FLAG-expressing phage. [Axis label: phage pfu/ml.]

After optimizing enrichment conditions, we performed our optimized immunoprecipitation
assay on our synthetic phage library to validate our enrichment protocol. To this end, we
immunoprecipitated a pool of our library with a combination of 9 antibodies, all of which
are directed against the C-termini of different human proteins. This pool consisted of
27,000 unique phage displaying the C-terminal peptidome. Finally, a plaque lift of the
immunoprecipitated phage was immunoblotted with the same mix of C-terminal
antibodies (Figure 5). Whereas there were apparently no “hits” on a plaque lift derived
from the input library, we observed a large fraction of positively staining plaques on the
immunoprecipitated sample. Of the thirty hits that were sequenced, all
corresponded to peptides specifically targeted by the C-terminally-directed input
antibodies, suggesting that our optimized enrichment protocol is highly selective for
specific interactions.
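
The enrichment factor quoted for the FLAG spike-in experiment is not defined explicitly in this report. The sketch below shows one plausible definition, stated here as an assumption: the ratio of target to background phage after IP, divided by the same ratio in the input. A definition of this kind is consistent with a value above 1000, whereas a simple ratio of target fractions could not exceed 1000 for a 1:1000 spike-in. The function name and plaque counts are hypothetical.

```python
def enrichment_factor(target_in, background_in, target_out, background_out):
    """Fold enrichment of target phage over background phage across an IP.

    Defined here (an assumption, not the report's stated formula) as
    (target_out / background_out) / (target_in / background_in).
    """
    if min(target_in, background_in, background_out) <= 0:
        raise ValueError("input counts and output background must be positive")
    input_ratio = target_in / background_in
    output_ratio = target_out / background_out
    return output_ratio / input_ratio


# Hypothetical plaque counts: a 1:1000 input spike (1 FLAG phage per 999
# native phage) and an eluate in which 80 of 100 plaques are FLAG-positive.
fold = enrichment_factor(target_in=1, background_in=999,
                         target_out=80, background_out=20)
print(f"enrichment is roughly {fold:.0f}-fold")
```

With these toy counts the eluate would be 80% FLAG phage, which corresponds to an enrichment of roughly 4000-fold under this definition.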


Figure 5. Plaque lifts of the C-terminal peptidome libraries, stained with a pool of 9
anti-C-terminal protein antibodies. The blot on the left is from the input library, and the
blot on the right is from the phage immunoprecipitated with the same pool of antibodies.

As reported last year, we had experienced difficulty using microarrays to uniquely
identify the phage species present in a given sample. This was largely because
our probe design was highly constrained by the sequences of the oligonucleotide inserts.
A high degree of cross-hybridization was observed, which compromised the quality of
our data. The alternative, high-throughput sequencing of the oligo inserts, is expensive
and therefore prohibitive for a screen of >100 samples. We have therefore developed an
indexing scheme to allow multiplexing of samples during high-throughput sequencing,
thereby reducing the cost per sample (Figure 6).


[Figure 6 diagram labels. Panel A: Sequencing Adapter 1, Sample Index mismatch, Library
Insert, T7 Phage genome, Sequencing Adapter 2. Panel B: Sample Index, Sequencing Primer,
Library Insert.]

Figure 6. Indexing and sequencing strategy for the library inserts. A. Two primers are
used to amplify the library insert from the phage genome. Each primer includes the
appropriate adapter for bridge PCR. Additionally, one primer contains a mismatch at one
of three bases, which will be used to uniquely identify the sample within a multiplexed
pool. B. Sequencing of the index and library insert in the Solexa flow cell.
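
The indexing idea in Figure 6 can be sketched in a few lines: enumerate the single-mismatch variants of a three-base index, assign one variant per sample, and then sort reads by the index they carry. The reference index "ACG", the read layout (index bases first, insert immediately after), the sample names and the toy reads are all assumptions made for illustration. The report does not give the actual primer sequences or read structure.

```python
from collections import Counter, defaultdict


def single_mismatch_indexes(reference):
    """All sequences that differ from the reference at exactly one base."""
    variants = []
    for pos, ref_base in enumerate(reference):
        for base in "ACGT":
            if base != ref_base:
                variants.append(reference[:pos] + base + reference[pos + 1:])
    return variants


def demultiplex(reads, index_to_sample, index_len=3):
    """Count library inserts per sample; reads with unknown indexes are skipped."""
    counts = defaultdict(Counter)
    unassigned = 0
    for read in reads:
        index, insert = read[:index_len], read[index_len:]
        sample = index_to_sample.get(index)
        if sample is None:
            unassigned += 1
            continue
        counts[sample][insert] += 1
    return counts, unassigned


# A three-base index with one mismatch gives 3 positions x 3 alternative
# bases = 9 codes; the first 8 are used here for an 8-sample pool.
variants = single_mismatch_indexes("ACG")
index_to_sample = {idx: f"sample_{i + 1}" for i, idx in enumerate(variants[:8])}

toy_reads = ["CCGTTTT", "CCGTTTT", "GCGAAAA", "ACGAAAA"]
counts, unassigned = demultiplex(toy_reads, index_to_sample)
print(dict(counts), "unassigned:", unassigned)
```

In the toy data the read carrying the unmodified index "ACG" is left unassigned, since only mismatched indexes are mapped to samples in this sketch.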


We  performed  Solexa  sequencing  on  the  T7  C-terminal  peptidome  library.  After 
alignment  of  the  8.4  million  reads  which  passed  QC,  we  noted  that  about  94%  of  all 
22,454  expected  sequences  were  indeed  observed.  Next,  after  performing 
immunoprecipitation  using  healthy  control  serum,  90%  of  the  input  phage  population 
remained,  suggesting  that  the  IP  did  not  significantly  bottleneck  the  population.  As  an  IP 
control,  we  spiked  in  three  of  the  C-terminal  directed  antibodies  (2  ng/ml)  that  had  been 
validated  previously.  The  three  target  peptides  corresponding  to  these  antibodies  were 
among the most strongly enriched sequences in the dataset.

Figure 7. Sequencing of the C-terminal peptidome phage population. A-B. Distribution of
the sequence abundance for each clone in the C-terminal library. C. Scatter plot of
sequence abundance for each clone, comparing the input population with the population
after IP from healthy serum. High abundance T7 controls are shown in blue. Targets of the
three spiked-in antibodies (ATR, nibrin, SAPK4) are shown in red.
[Axis labels: Log Abundance Per Detected Clone; Log(Reads).]


In  order  to  assess  the  accuracy  of  the  sequence-based  enrichment  measurement,  we 
performed  sequencing  and  a  plaque  lift  on  the  same  immuno-precipitated  material.  By 
plaque  lift  assay  we  observed  that  about  1  of  every  35  phage  in  the  population  was 
SAPK4.  By  sequencing  the  same  sample,  1  of  every  25  sequences  aligned  with  the 
SAPK4  clone  (Figure  7). 
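
As a quick check on how closely the two measurements agree, the frequencies quoted above can be compared directly. The snippet below only restates the arithmetic (1 in 35 by plaque lift versus 1 in 25 by sequencing); the variable names are illustrative.

```python
from fractions import Fraction

plaque_lift = Fraction(1, 35)  # SAPK4 fraction by plaque lift
sequencing = Fraction(1, 25)   # SAPK4 fraction by sequencing

fold_difference = sequencing / plaque_lift

print(f"plaque lift: {float(plaque_lift):.1%}")
print(f"sequencing:  {float(sequencing):.1%}")
print(f"sequencing / plaque lift = {float(fold_difference):.2f}")
```

The two estimates (about 2.9% and 4.0%) differ by a factor of 1.4, which is small relative to the enrichment being measured.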

The C-terminal peptidome IP was then performed using 3 pathological sera. Sera from
one breast cancer (BC) patient, one confirmed paraneoplastic disease (PND) patient, and
one multiple sclerosis (MS) patient were used in the screen. SAPK4 antibody had been
spiked in at 1:1000 (relative to patient antibody) and this clone was strongly enriched in
all of the samples. We did not see any obvious correlation between enriched clones and
the particular pathology, but this is not surprising since the screen utilized only
C-terminal peptides, and not the full proteome, which is the next step. This dataset was used
to model the effect of screening against the more complex (~19-fold) full peptidome
library, as well as the effect of multiplexing samples (8-fold) during a large-scale
screen.
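
A back-of-the-envelope version of this kind of modeling is shown below. It uses the read count and library size reported for the C-terminal sequencing run and assumes, purely for illustration, that a run yields the same number of usable reads, that reads are spread evenly over clones, and that multiplexed samples share the run equally. The report's actual model is not described, so this is only a sketch of the scaling involved.

```python
def reads_per_clone(total_reads, library_size, samples_per_run=1):
    """Average reads per clone per sample, assuming uniform sampling."""
    return total_reads / (library_size * samples_per_run)


TOTAL_READS = 8_400_000   # reads passing QC in the C-terminal run
C_TERMINAL_CLONES = 22_454  # expected C-terminal sequences
COMPLEXITY_FOLD = 19      # approximate fold increase for the full peptidome
MULTIPLEX = 8             # samples pooled per run

c_terminal_depth = reads_per_clone(TOTAL_READS, C_TERMINAL_CLONES)
full_depth = reads_per_clone(TOTAL_READS, C_TERMINAL_CLONES * COMPLEXITY_FOLD)
multiplexed_depth = reads_per_clone(
    TOTAL_READS, C_TERMINAL_CLONES * COMPLEXITY_FOLD, samples_per_run=MULTIPLEX
)

print(f"C-terminal library:        {c_terminal_depth:.0f} reads per clone")
print(f"full peptidome:            {full_depth:.1f} reads per clone")
print(f"full peptidome, 8 samples: {multiplexed_depth:.1f} reads per clone per sample")
```

Under these assumptions the average depth falls from several hundred reads per clone to only a few, which is why strongly enriched clones, rather than the bulk population, are what a multiplexed screen can be expected to detect.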

KEY RESEARCH ACCOMPLISHMENTS


1.  Generation  of  genome  wide  oligos  that  cover  the  entire  coding  capacity  of  the 
human  genome. 

2.  Generation  of  genome  wide  phage  display  libraries  that  cover  the  entire  coding 
capacity  of  the  human  genome  for  the  auto-antibody  profiling. 

3.  The  determination  that  IPs  can  be  performed  on  these  phage  display  libraries  and 
enrich  target  phage. 

4.  Development  of  a  deconvolution  strategy  that  allows  multiplex  sequencing  of 
enriched  phage  for  deconvolution  through  next  generation  sequencing  methods. 

5.  Generation  of  a  shRNA  version  2  library  in  a  high  knockdown  MSCV  vector. 

6. Discovery that PolII promoters with a spacer give single copy knockdown
through the PRIME Lenti vectors.

7.  Screening  of  this  library  to  identify  more  genes  that  repress  transformation  in 
mammary  epithelial  cells. 

8. Identification of several genes, including REST among 10 others, that can suppress
transformation of mammary cells. This was done using our version 2 library, and we
are only part way through this screen.

9.  Identification  of  BetaTRCP  as  the  ubiquitin  ligase  that  controls  REST  stability. 

10. Identification of CDYL as a bridge between REST and methyl transferases to
repress gene expression and determination that CDYL suppresses cellular
transformation in breast cells.

11.  Identification  of  several  genes  that  when  overexpressed  cause  transformation  of 
mammary  cells. 

12.  Demonstration  that  our  new  version  2  libraries  are  capable  of  being  screened  in 
large  pools  to  identify  hairpins  that  are  toxic  to  cancer  and  normal  cells.  This  is  a 
key  finding  essential  to  accomplishing  the  goals  of  this  grant. 

13.  Development  of  a  new  barcoding  method,  the  half-hairpin  screening  method  that 
works  with  our  version  2  libraries  that  gives  us  an  ability  to  quickly  screen 
libraries  before  their  60-mer  barcodes  are  sequenced. 

14.  Identification  of  genes  that  are  selectively  toxic  to  cancer  cells. 

15.  Identification  of  several  genes  that  appear  to  be  synthetically  lethal  with  ras. 

16. Pharmacological validation of Plk1 as a synthetic lethal with Kras.

17.  Demonstration  that  we  can  perform  genome-wide  shRNA  screens  on  breast  lines. 

18. Development of a library of 250,000 new orthogonal 25-mer barcodes that do
not cross-hybridize and so faithfully report on the relative abundance of shRNA vectors.

19. Identification of genes that are selectively toxic to multiple breast cancer cell
lines.

With respect to the Statement of Work, we have accomplished many of the goals of years
1, 2, 3, 4 and 5 shown below.


Statement  of  Work. 

Year  1. 

Task  1  (Months  1-12) 

In  the  first  year  we  anticipate  beginning  to  work  out  the  conditions  for  using  the  bar 
coding  method  to  follow  retroviruses  containing  hairpins  as  mixtures  in  complex 
libraries.  We  now  have  a  library  of  22,000  hairpins  covering  about  8,000  genes.  We 
will  be  performing  exploratory  screens  and  optimizations  to  test  the  quality  of  the 
barcoding method. We must have this method working well to carry out the
synthetic lethal screens.

We accomplished this goal in two ways. First, we performed a bar code screen for
potential tumor suppressors and identified several genes described in our first report and
in Westbrook et al., 2005. Second, we have improved our vectors to allow single copy
knockdown as described in Stegmeier et al., 2005. This was absolutely essential for the
bar coding experiments we have proposed to kill cancer cells.

Task  2  (Months  1-24  and  possibly  longer,  an  ongoing  effort) 

We  will  continue  to  expand  the  library  during  this  period  to  encompass  more  genes. 
This  will  be  done  in  collaboration  with  Dr.  Greg  Hannon. 

We  have  accomplished  this  goal  by  the  generation  of  a  second  generation  library  in  the 
mir30  context  as  described  in  Silva  et  al.,  2005.  This  covers  140,00  human  and  mouse 
shRNAs, as described in last year's report. We have also developed new and better
knockdown vectors to allow us to knock down genes with greater penetrance. Right now
we feel we have nearly genome-wide coverage and are working on a new library which, if
successful, will be a much better and more trustworthy library.

Task  3  (Months  6-24) 

We  also  will  begin  the  process  of  analyzing  the  human  genome  for  coding  sequences 
to  set  up  the  bio-informatics  analysis  to  generate  a  list  of  sequences  we  wish  to 
express  to  look  for  auto-antibodies.  We  should  begin  synthesizing  oligo  nucleotides 
to  cover  human  genes. 

We have designed oligonucleotides to cover the human genome and have finished cloning
them into phage display vectors. We are characterizing the libraries and trying to figure
out how best to screen them. Screening them by microarray ran into
cross-hybridization problems, which we are addressing bioinformatically.


Year 2.


Task  4  (Months  13-24) 

In this period we plan to begin to carry out screens to look for genes which when
knocked  down  by  shRNA  will  interfere  with  the  growth  of  cells  containing  defined 
mutations  that  lead  to  breast  cancer.  We  will  start  with  known  tumor  suppressors 
such  as  loss  of  p53  and  Rb.  We  will  use  the  barcoding  methods.  We  may  also  screen 
for  genes  that  sensitize  cells  to  killing  by  gamma  IR. 

We initially tried PTEN mutants but were unable to find synthetic lethals. We have now
successfully started with Kras and, in a pilot experiment, identified a few reproducible
genes that are selectively toxic to Kras mutant cells.

Task  5  (Months  18-36) 

We  will  begin  to  synthesize  shRNA  clones  corresponding  to  the  mouse  genome. 

We  have  completed  this. 

Task  6  (Months  12-24) 

We  will  expand  the  library  of  short  coding  regions  for  the  autoantibody  project  and 
work  out  conditions  to  express  these  protein  fragments  in  bacteria  in  a  high 
through-put  fashion. 

We  have  made  the  libraries  and  are  working  on  developing  methods  to  analyze  the 
results. 


Year  3. 

Task  7  (Months  24-36) 

We  will  continue  to  screen  for  synthetic  lethals  with  tumor  causing  mutations 
relevant  to  breast  cancer.  In  addition,  by  this  time  we  will  be  retesting  the  synthetic 
lethal  positives  from  the  initial  screens  performed  in  year  two. 

We performed straight lethal and synthetic lethal experiments with ras. We have carried
out one screen and now hope to examine our hits in breast lines with active and inactive
ras.

Task  8  (Months  24-36) 

We  plan  to  work  out  the  conditions  for  placing  the  proteins  expressing  short 
segments  of  human  proteins  for  the  auto-antibody  screening  project  onto  glass  slides 
for  screening  purposes. 

We have abandoned this aim because we switched our approach to a phage display library,
which does not require glass slides. We made our first comprehensive library in a T7
display vector.

Task 9 (Months 24-36)

We  will  continue  to  characterize  the  mouse  shRNA  library. 

We  are  characterizing  the  mouse  library.  It  was  transferred  into  our  best  knockdown 
vector.  We  are  in  the  process  of  performing  a  screen  in  stem  cells  to  look  for  genes  that 
enhance  ionizing  radiation  resistance  or  sensitivity. 


Years  4  and  5. 

These  years  are  listed  together  as  they  will  be  consumed  with  executing  the  long-term 
goals  of  the  Tasks  outlined  in  years  1  through  3. 

Task  10  (Months  36-60) 

We  will  begin  to  screen  human  sera  for  autoantibodies  against  our  arrays  of  human 
protein  fragments.  We  will  work  out  these  methods  and  attempt  to  begin  a  higher 
through-put  analysis  to  determine  if  common  epitopes  are  eliciting  a  response  in 
breast  cancer  patients. 

We have made the libraries and obtained the serum samples. In this last year we have
established with reconstruction experiments that the methods should work, and we are
working out ways to screen the data generated. We cannot use microarray readouts
because of cross-hybridization. We have solved this by using next-generation
sequencing on the Solexa platform. We are in the process of performing the initial IPs
to characterize the immunome's interaction with the human peptidome.


Task  11  (Months  36-60) 

We  will  infect  mice  with  retroviral  libraries  and  screen  for  tumor  suppressors  in  the 
breast  and  possibly  other  tissues. 

We did not get to the point where we could pursue this aim, as we are consumed with
finding the cancer-specific lethals.

Task  12  (Months  36-60) 

We  will  be  examining  the  genes  we  have  found  in  various  screens  using  standard 
molecular  biological  approaches  to  understand  their  roles  in  control  of  the 
responses  we  screened  for  in  previous  tasks. 

We are doing this with some of our tumor suppressor hits and some potential oncogenes
we have found. We are following up on the cancer-specific lethals as well as the ras
synthetic lethals. The ras synthetic lethals appear to be falling into a pathway that reveals
that ras mutant cells are sensitive to mitotic perturbation.


CONCLUSIONS


Progress  on  understanding  the  REST  Tumor  suppressor  pathway. 

It is clear that REST acts as a tumor suppressor in mammary cells. That means that the
pathways controlling REST are also likely to be important in tumor suppression.
We have discovered a new protein kinase-driven pathway that targets REST for
proteolytic degradation via the SCF using the F-box protein βTRCP. Since βTRCP
overproduction is oncogenic and causes cellular transformation through REST
degradation, we have now established a cellular transformation pathway that links a
known oncogene to the negative control of a tumor suppressor gene.

Progress  on  barcode  screening  for  essential  genes. 

It is clear from our current studies that we have overcome the main problems with
performing bar code screens, which are getting sufficiently good knockdown from single
copy vectors and being able to reproducibly measure their abundance in complex pools
by microarray hybridization. We have now gone most of the way through a genome-wide
screen for cancer-specific lethals in three genetically distinct breast cancer lines. This has
been our major goal from the very start and we are now verifying the results and determining
which are truly cancer specific. In addition, we have begun a screen to find genes that are
synthetically lethal with Kras mutations. We hope to start cMyc and PI3K synthetic
lethal screens in the next year. This should tell us which mutations in the breast lines are
causing the synthetic lethality we are seeing.

Screens  for  Tumor  suppressors  using  the  RNAi  library  and  for  Oncogenes  using  the 
ORFeome  library. 

We originally thought last year that this ongoing effort would be completed this year.
However, we ran into trouble with our cell transformation assay. Apparently the supplier
of our specialized media for HMECs switched some of their components and nothing
worked. We worked hard for 6 months and thought we had finally overcome that
problem, which is important for both the shRNA screens and the overproduction
screens for oncogenes, but we were wrong and still have a problem. We think we may be
back on track but it will be a while. We have, however, now constructed new shRNA
libraries that have 12 hairpins per gene, which we will screen using HMECs if possible.

In  addition  we  are  performing  the  same  screens  with  retroviral  ORFeome  libraries,  which 
are  the  equivalent  to  normalized  full-length  cDNA  libraries.  One  gene  we  are  following 
up  is  PVRL4/Nectin-4.  It  potently  transforms  HMECs  and  is  overproduced  in  62%  of 
ductal  carcinomas.  We  have  found  several  breast  lines  that  express  PVRL4  and  when  it 
is  knocked  down  the  cells  show  reduced  proliferation  and  no  longer  grow  in  clumps  like 
stem cells. We hope to have a paper on this next year.

We now have the libraries in hand and the patient samples. We ran into read-out
problems on the microarrays because we are forced to use the library insert sequences
themselves as opposed to optimized barcodes. We have solved this problem by converting
to DNA sequencing, which unfortunately is much more expensive, but we have shown that
it can work in reconstruction experiments. We are hoping that this will be done in the next
year.

So  what. 

I  feel  this  work  has  generated  not  only  the  discovery  of  many  genes  on  which  people  will 
build  careers  studying  their  roles  in  breast  cancer,  but  also  a  set  of  tools  and 
methodologies  that  have  already  transformed  the  way  we  and  others  are  approaching  the 
cancer  problem.  These  methods  and  reagents  are  readily  available  to  the  scientific 
community. We have developed the theory of non-oncogene addiction(32,33), which
proposes that cancer cells are addicted to genes that are not themselves oncogenes or
tumor suppressors and are not going to be found in cancer genome sequencing efforts.
This is a conceptual advance that has important ramifications for cancer drug discovery.
And, most importantly, we are not done yet.